Graph convolutional neural networks have shown significant potential in natural and histopathology images. However, their use has only been studied in a single magnification or multi-magnification with late fusion. In order to leverage the multi-magnification information and early fusion with graph convolutional networks, we handle different embedding spaces at each magnification by introducing the Multi-Scale Relational Graph Convolutional Network (MS-RGCN) as a multiple instance learning method. We model histopathology image patches and their relation with neighboring patches and patches at other scales (i.e., magnifications) as a graph. To pass the information between different magnification embedding spaces, we define separate message-passing neural networks based on the node and edge type. We experiment on prostate cancer histopathology images to predict the grade groups based on the extracted features from patches. We also compare our MS-RGCN with multiple state-of-the-art methods with evaluations on both source and held-out datasets. Our method outperforms the state-of-the-art on both datasets and especially on the classification of grade groups 2 and 3, which are significant for clinical decisions for patient management. Through an ablation study, we test and show the value of the pertinent design features of the MS-RGCN.
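The idea of defining a separate message-passing function per edge type can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the MS-RGCN implementation: the function names, the edge-type labels `same_scale` and `cross_scale`, mean aggregation, and the single shared self-weight (the abstract states that the method also distinguishes node types).

```python
from collections import defaultdict


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relational_layer(features, edges, weights, self_weight):
    """One relational graph-convolution step (illustrative only).

    features:    {node: vector}
    edges:       list of (src, dst, edge_type)
    weights:     {edge_type: matrix}, one transform per edge type
    self_weight: matrix applied to each node's own features
    """
    # Transform every message with the weight matrix of its edge type.
    incoming = defaultdict(list)
    for src, dst, etype in edges:
        incoming[(dst, etype)].append(matvec(weights[etype], features[src]))

    out = {}
    for node, x in features.items():
        h = matvec(self_weight, x)
        for etype in weights:
            msgs = incoming.get((node, etype), [])
            if msgs:
                # Mean-aggregate the messages of this edge type, then add.
                mean = [sum(col) / len(msgs) for col in zip(*msgs)]
                h = [a + b for a, b in zip(h, mean)]
        out[node] = [max(0.0, v) for v in h]  # ReLU
    return out


# Hypothetical two-patch graph: one low- and one high-magnification patch.
identity = [[1.0, 0.0], [0.0, 1.0]]
features = {"patch_low": [1.0, 0.0], "patch_high": [0.0, 1.0]}
edges = [("patch_high", "patch_low", "cross_scale")]
weights = {"same_scale": identity, "cross_scale": [[0.0, 2.0], [0.0, 0.0]]}
updated = relational_layer(features, edges, weights, identity)
```

Because the `cross_scale` matrix differs from the `same_scale` one, features from another magnification are mapped into the receiving node's embedding space before being aggregated.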
Previous work has shown the potential of deep learning to predict renal obstruction using kidney ultrasound images. However, these image-based classifiers have been trained with the goal of single-visit inference in mind. We compare methods from video action recognition (i.e. convolutional pooling, LSTM, TSM) to adapt single-visit convolutional models to handle multiple visit inference. We demonstrate that incorporating images from a patient's past hospital visits provides only a small benefit for the prediction of obstructive hydronephrosis. Therefore, inclusion of prior ultrasounds is beneficial, but prediction based on the latest ultrasound is sufficient for patient risk stratification.
Online media data, in the form of images and videos, are becoming mainstream communication channels. However, recent advances in deep learning, particularly deep generative models, have opened the door to producing perceptually convincing images and videos at low cost, which not only poses a serious threat to the trustworthiness of digital information but also has severe societal implications. This motivates growing research interest in media tampering detection, i.e., using deep learning techniques to examine whether media data have been maliciously manipulated. Depending on the content of the targeted images, media forgery can be divided into image tampering and Deepfake techniques. The former typically moves or erases visual elements in ordinary images, while the latter manipulates the expressions and even the identity of human faces. Accordingly, the means of defense include image tampering detection and Deepfake detection, which share a wide variety of properties. In this paper, we provide a comprehensive review of current media tampering detection approaches and discuss the challenges and trends in this field for future research.
Change point detection (CPD) methods aim to detect abrupt changes in time-series data. Recent CPD methods have demonstrated their potential in identifying changes in underlying statistical distributions but often fail to capture complex changes in the correlation structure in time-series data. These methods also fail to generalize effectively, as even within the same time-series, different kinds of change points (CPs) may arise that are best characterized by different types of time-series perturbations. To address this issue, we propose TiVaCPD, a CPD methodology that uses a time-varying graphical lasso based method to identify changes in correlation patterns between features over time, and combines that with an aggregate Kernel Maximum Mean Discrepancy (MMD) test to identify subtle changes in the underlying statistical distributions of dynamically established time windows. We evaluate the performance of TiVaCPD in identifying and characterizing various types of CPs in time-series and show that our method outperforms current state-of-the-art CPD methods for all categories of CPs.
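To make the MMD component concrete, here is a minimal sketch of a kernel MMD score computed between the windows before and after each time step. It is a simplification under stated assumptions: a single RBF kernel with a fixed `gamma`, fixed-size windows, and the biased MMD estimate. The abstract describes an aggregate kernel test over dynamically established windows combined with a time-varying graphical lasso, none of which is reproduced here; all function names are hypothetical.

```python
import math


def rbf(a, b, gamma=1.0):
    """RBF kernel between two feature vectors."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of the squared MMD between samples xs and ys."""
    def mean_k(us, vs):
        return sum(rbf(u, v, gamma) for u in us for v in vs) / (len(us) * len(vs))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2 * mean_k(xs, ys)


def mmd_scores(series, window, gamma=1.0):
    """Compare the window before t with the window after t for every valid t."""
    scores = []
    for t in range(window, len(series) - window + 1):
        before = series[t - window:t]
        after = series[t:t + window]
        scores.append((t, mmd2(before, after, gamma)))
    return scores


# Synthetic one-dimensional series with a mean shift at index 20.
series = [[0.0]] * 20 + [[3.0]] * 20
scores = mmd_scores(series, window=5)
best_t = max(scores, key=lambda item: item[1])[0]
```

On this toy series the score peaks where the two windows straddle the shift; a distribution-only score like this one is exactly what may miss changes that affect only the correlation structure.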
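Bandit algorithms are commonly used in the e-commerce industry to train machine learning (ML) systems when labeled data is unavailable. However, the industry setting poses various challenges that make implementing bandit algorithms in practice non-trivial. In this paper, we elaborate on the challenges of off-policy optimization, delayed reward, concept drift, reward design, and business rule constraints. Our main contribution is an extension to the Open Bandit Pipeline (OBP) framework. We provide simulation components for some of the above challenges, giving future practitioners, researchers, and educators a resource for addressing the challenges encountered in the e-commerce industry.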
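As background for the off-policy setting mentioned above, the following is a minimal sketch of the standard inverse propensity weighting (IPW) estimator, which estimates the value of a new policy from logs collected by a different one. It deliberately does not use the OBP API; the function name, the log layout, and the toy policies are assumptions for illustration.

```python
def ipw_estimate(logs, target_policy):
    """Inverse propensity weighting estimate of a target policy's value.

    logs:          list of (context, action, reward, logging_propensity),
                   where logging_propensity is the probability with which
                   the logging policy chose the recorded action
    target_policy: function (context, action) -> probability of that action
                   under the policy being evaluated
    """
    total = 0.0
    for context, action, reward, propensity in logs:
        # Reweight each logged reward by how much more (or less) likely
        # the target policy is to take the logged action.
        total += target_policy(context, action) / propensity * reward
    return total / len(logs)


# Toy log: a uniform logging policy over two actions; only action 0 pays off.
logs = [("user", 0, 1.0, 0.5), ("user", 1, 0.0, 0.5)]


def always_first(context, action):
    return 1.0 if action == 0 else 0.0


def always_second(context, action):
    return 1.0 if action == 1 else 0.0


value_first = ipw_estimate(logs, always_first)
value_second = ipw_estimate(logs, always_second)
```

The estimator needs only the logged propensities, which is why it suits settings where the new policy cannot be tested online first; its variance grows when the logging propensities are small.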
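Text-VQA aims to answer questions that require understanding the textual cues in an image. Although existing Text-VQA methods have made great progress, their performance suffers from a shortage of human-labeled question-answer (QA) pairs. We observe, however, that scene text is generally not fully exploited in existing datasets: only a small portion of the text in each image participates in the annotated QA activities. This wastes a large amount of useful information. To address this deficiency, we develop a new method that generates high-quality and diverse QA pairs by explicitly utilizing the existing text available in the scene context of each image. Specifically, we propose TAG, a text-aware visual question-answer generation architecture that learns to produce meaningful and accurate QA samples using a multimodal transformer. The architecture exploits underexplored scene text information and enhances the scene understanding of Text-VQA models by combining the generated QA pairs with the initial training data. Extensive experimental results on two well-known Text-VQA benchmarks (TextVQA and ST-VQA) demonstrate that our proposed TAG effectively enlarges the training data, which helps improve Text-VQA performance without extra labeling effort. Moreover, our model outperforms state-of-the-art approaches that are pre-trained with additional large-scale data. Code will be made publicly available.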
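Several types of dependencies have been proposed for the static analysis of existential rule ontologies, promising insights into computational properties and possible practical uses of a given set of rules, e.g., in ontology-based query answering. Unfortunately, these dependencies are rarely implemented, so their potential is hardly realized in practice. We focus on two kinds of rule dependencies (positive reliances and restraints) and design and implement optimized algorithms for their efficient computation. Experiments on real-world ontologies of more than 100,000 rules show the scalability of our approach, which lets us realize several previously proposed applications as practical case studies. In particular, we can analyze to what extent rule-based bottom-up reasoning approaches can be guaranteed to yield redundancy-free "lean" knowledge graphs (so-called cores) on practical ontologies.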
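We propose Neural Space-filling Curves (SFCs), a data-driven approach to inferring a context-based scan order for a set of images. Linear ordering of pixels forms the basis for many applications, such as video scrambling, compression, and the auto-regressive models used in generative modeling of images. Existing algorithms resort to a fixed scanning algorithm such as raster scan or Hilbert scan. Instead, our work learns a spatially coherent linear ordering of pixels from a dataset of images using a graph-based neural network. The resulting Neural SFC is optimized for an objective suited to the downstream task when the image is traversed along the scan-line order. We show the advantage of using Neural SFCs in downstream applications such as image compression. Code and additional results will be made available at https://hywang66.github.io/publication/neuralsfc.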
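For reference, the two fixed scan orders named above can be generated in a few lines. This is a sketch of the fixed baselines only, not of the learned Neural SFC; the function names are illustrative, and the Hilbert construction is the standard iterative index-to-coordinate mapping for a grid whose side length is a power of two.

```python
def raster_order(n):
    """Row-by-row scan of an n x n grid as a list of (x, y) cells."""
    return [(x, y) for y in range(n) for x in range(n)]


def hilbert_order(n):
    """Hilbert-curve scan of an n x n grid (n must be a power of two)."""
    order = []
    for d in range(n * n):
        x = y = 0
        t = d
        s = 1
        while s < n:
            rx = 1 & (t // 2)
            ry = 1 & (t ^ rx)
            if ry == 0:
                # Rotate or flip the quadrant so sub-curves join up.
                if rx == 1:
                    x = s - 1 - x
                    y = s - 1 - y
                x, y = y, x
            x += s * rx
            y += s * ry
            t //= 4
            s *= 2
        order.append((x, y))
    return order


def max_step(order):
    """Largest Manhattan distance between consecutive cells in a scan."""
    return max(abs(x1 - x0) + abs(y1 - y0)
               for (x0, y0), (x1, y1) in zip(order, order[1:]))


raster = raster_order(4)
hilbert = hilbert_order(4)
```

The Hilbert scan always moves to an adjacent pixel, whereas the raster scan jumps across the image at the end of each row. Both orders ignore image content, which is the limitation a learned, context-based ordering is meant to address.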
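Driving safety analysis has recently seen unprecedented improvements thanks to technological advances in precise positioning sensors, artificial intelligence (AI)-based safety features, autonomous driving systems, connected vehicles, high-throughput computing, and edge computing servers. In particular, deep learning (DL) methods have enabled large-volume video processing to extract safety-related features from the massive videos captured by roadside units (RSUs). Safety metrics are commonly used measures for investigating crashes and near-conflict events. However, these metrics provide limited insight into overall network-level traffic management. On the other hand, some safety assessment efforts are devoted to processing crash reports and identifying spatial and temporal patterns of crashes that correlate with road geometry, traffic volume, and weather conditions. This approach relies solely on crash reports and ignores the rich information in traffic videos, which can help identify the role of safety violations in crashes. To bridge these two perspectives, we define a new set of network-level safety metrics (NSMs) to assess the overall safety profile of traffic flow by processing imagery taken by RSU cameras. Our analysis shows that NSMs have significant statistical associations with crash rates. This approach differs from simply generalizing the results of individual crash analyses, since all vehicles contribute to the calculation of NSMs, not only those involved in crash incidents. This perspective treats traffic flow as a complex dynamic system in which the actions of some nodes can propagate through the network and influence the crash risk of other nodes. We also provide a comprehensive review of surrogate safety metrics (SSMs) in Appendix A.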
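Suppose we observe a random vector $x$ from some distribution $p$ in a known family with unknown parameters. We ask the following question: when is it possible to split $x$ into two parts $f(x)$ and $g(x)$ such that neither part is sufficient to reconstruct $x$ by itself, but both together can recover $x$ fully, and the joint distribution of $(f(x), g(x))$ is tractable? As one example, if $x = (x_1, \dots, x_n)$ and $p$ is a product distribution, then for any $m < n$, we can split the sample to define $f(x) = (x_1, \dots, x_m)$ and $g(x) = (x_{m+1}, \dots, x_n)$. Rasines and Young (2021) offer an alternative route to accomplishing this task through randomization of $x$ with additive Gaussian noise, which enables post-selection inference in finite samples for Gaussian distributed data and asymptotically for non-Gaussian additive models. In this paper, we offer a more general methodology for achieving such a split in finite samples by borrowing ideas from Bayesian inference to yield a (frequentist) solution that can be viewed as a continuous analog of data splitting. We call our method data blurring, as an alternative to data splitting, data carving, and p-value masking. We exemplify the method on a few prototypical applications, such as post-selection inference for trend filtering and other regression problems.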
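The Gaussian case admits a short worked sketch. Assuming $x \sim N(\mu, \sigma^2)$ with known $\sigma$, draw independent noise $z \sim N(0, \sigma^2)$ and set $f(x) = x + z$, $g(x) = x - z$. Then $\mathrm{Cov}(f, g) = \mathrm{Var}(x) - \mathrm{Var}(z) = 0$, so the two jointly Gaussian parts are independent, and $x = (f + g)/2$ is recovered exactly from both. The code below checks this numerically; the function names and the equal-variance noise choice are assumptions for illustration, not the general construction described in the abstract.

```python
import random


def split_gaussian(x, sigma, rng):
    """Split one Gaussian observation into two parts using additive noise."""
    z = rng.gauss(0.0, sigma)  # independent noise with the same variance as x
    return x + z, x - z


def reconstruct(f, g):
    """Recover the original observation from both parts."""
    return (f + g) / 2.0


def correlation(us, vs):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(us)
    mean_u, mean_v = sum(us) / n, sum(vs) / n
    cov = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    var_u = sum((u - mean_u) ** 2 for u in us)
    var_v = sum((v - mean_v) ** 2 for v in vs)
    return cov / (var_u * var_v) ** 0.5


rng = random.Random(0)
sigma = 1.0
xs = [rng.gauss(2.0, sigma) for _ in range(20000)]
parts = [split_gaussian(x, sigma, rng) for x in xs]
fs = [f for f, _ in parts]
gs = [g for _, g in parts]
r = correlation(fs, gs)
```

Each part alone is a noisier view of $\mu$ (variance $2\sigma^2$), so one part can be used for selection and the other for inference, in the same way the two halves of a split sample would be.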